UCT/IB/MLX5/DC: introducing dcs_hybrid policy #10138
src/uct/ib/mlx5/dc/dc_mlx5_ep.h
Outdated
uct_dc_mlx5_dci_pool_init_dci(iface, uct_dc_mlx5_ep_pool_index(ep),
                              UCT_DC_MLX5_HW_DCI);
Does this mean HW DCS is the first DCI, i.e. the one most likely to be used?
src/uct/ib/mlx5/dc/dc_mlx5_ep.h
Outdated
if (!uct_dc_mlx5_iface_is_hybrid(iface)) {
    return UCS_ERR_NO_RESOURCE;
}

if (ep->dci == UCT_DC_MLX5_HW_DCI) {
    return UCS_OK;
}

if (uct_dc_mlx5_iface_dci_has_tx_resources(iface, UCT_DC_MLX5_HW_DCI)) {
Can we try to reduce the number of branches on the fast path?
Add an hw_dci attribute to iface, which will be -1 if hybrid was not selected.
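A minimal sketch of how such a sentinel could fold two checks into one comparison. All names below (hw_sketch_iface_t, hw_sketch_ep_t, HW_SKETCH_NO_HW_DCI, hw_sketch_ep_uses_hw_dci) are hypothetical stand-ins for illustration, not the UCX definitions, and the field layout is an assumption.

```c
#include <stdint.h>

/* Hypothetical sentinel: the hybrid policy was not selected. */
#define HW_SKETCH_NO_HW_DCI (-1)

/* Simplified stand-in for the interface (not the real uct_dc_mlx5_iface_t). */
typedef struct {
    int hw_dci; /* index of the dedicated HW DCI, or -1 if hybrid is off */
} hw_sketch_iface_t;

/* Simplified stand-in for the endpoint. */
typedef struct {
    uint8_t dci; /* DCI index currently used by this endpoint */
} hw_sketch_ep_t;

/* One comparison answers both "is hybrid enabled?" and "is this EP already
 * on the HW DCI?": a uint8_t converted to int is never negative, so it can
 * never equal the -1 sentinel. */
static inline int hw_sketch_ep_uses_hw_dci(const hw_sketch_iface_t *iface,
                                           const hw_sketch_ep_t *ep)
{
    return (int)ep->dci == iface->hw_dci;
}
```

With the sentinel in place, a separate "is hybrid" test is only needed on the path where the endpoint is not yet on the HW DCI.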
can you create some DC specific gtests for this new mode?
@@ -901,7 +903,7 @@ uct_dc_mlx5_iface_dcis_create(uct_dc_mlx5_iface_t *iface,

     ucs_array_length(&iface->tx.dcis) = 0;

-    status = uct_dc_mlx5_iface_create_dci(iface, 0, 0);
+    status = uct_dc_mlx5_iface_create_dci(iface, 0, 0, 1);
Why just one channel and not self->tx.num_dci_channels?
It doesn't matter, as this dci is only created to query bb_max and then destroyed (lines 913-914).
I see. We probably need to rename this func, as its current name is quite confusing.
src/uct/ib/mlx5/dc/dc_mlx5_ep.h
Outdated
if (uct_dc_mlx5_iface_is_dci_shared(iface) ||
    uct_dc_mlx5_is_hw_dci(iface, ep->dci)) {
It seems these two checks are frequently used together. A separate func is worth creating.
src/uct/ib/mlx5/dc/dc_mlx5_ep.h
Outdated
return uct_dc_mlx5_iface_is_dci_shared(iface) ||
       uct_dc_mlx5_is_hw_dci(iface, dci);
Can we optimize this to be a single branch? It is used in uct_dc_mlx5_iface_dci_get.
I guess by replacing || with +?
Both of these checks look at the policy field. Maybe we can use a bitmap-based check, or something like "policy >= UCT_DC_TX_POLICY_DCS_HYBRID"?
is_hw_dci just checks dci == iface->tx.hybrid_hw_dci, so >= DCS_HYBRID won't help. We can just use bitwise | between is_shared and is_hw_dci if that's somehow better.
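For reference, a small sketch of the bitwise variant discussed above. The struct and the or_sketch_* helpers are hypothetical simplifications, not the real UCX code; whether this actually beats || depends on the compiler, which may already emit branch-free code when both operands are side-effect free.

```c
#include <stdint.h>

/* Simplified stand-in for the interface (assumed fields, not UCX's). */
typedef struct {
    int     policy_shared; /* nonzero when the TX policy shares DCIs */
    uint8_t hybrid_hw_dci; /* index of the dedicated HW DCI */
} or_sketch_iface_t;

/* Both predicates return strictly 0 or 1, so '|' is safe to combine them. */
static inline int or_sketch_is_shared(const or_sketch_iface_t *iface)
{
    return iface->policy_shared != 0;
}

static inline int or_sketch_is_hw_dci(const or_sketch_iface_t *iface,
                                      uint8_t dci)
{
    return dci == iface->hybrid_hw_dci;
}

/* Bitwise '|' evaluates both operands unconditionally (no short-circuit),
 * so the caller needs only one conditional jump on the combined result. */
static inline int or_sketch_is_dci_shared(const or_sketch_iface_t *iface,
                                          uint8_t dci)
{
    return or_sketch_is_shared(iface) | or_sketch_is_hw_dci(iface, dci);
}
```

The result is identical to the || form for these operands; only the evaluation strategy differs.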
src/uct/ib/mlx5/dc/dc_mlx5.h
Outdated
@@ -212,6 +212,7 @@ typedef struct uct_dc_dci {
     uint8_t path_index; /* Path index */
     uint8_t next_channel_index; /* next DCI channel index
                                    to be used by EP */
+    uint8_t is_shared; /* Indicates that this specific dci is shared, regardless of policy*/
- space before *
- line seems too long
- maybe make it uint8_t flags?
It seems like we can't avoid that extra branch, because now we don't know whether dci is UCT_DC_MLX5_EP_NO_DCI, which results in a sanitizer failure (accessing dcis[255]).
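A rough illustration of why the per-DCI flag needs a guard. The types below are hypothetical stand-ins, and the value 255 for the "no DCI" marker is inferred from the dcis[255] access mentioned in the comment, not taken from the UCX headers.

```c
#include <stdint.h>

/* Assumed "no DCI assigned" marker (inferred from the dcis[255] report). */
#define GUARD_SKETCH_EP_NO_DCI 255
#define GUARD_SKETCH_MAX_DCIS  16

/* Simplified per-DCI state carrying the flag added in this change. */
typedef struct {
    uint8_t is_shared;
} guard_sketch_dci_t;

/* Simplified interface holding a small, fixed DCI array. */
typedef struct {
    guard_sketch_dci_t dcis[GUARD_SKETCH_MAX_DCIS];
} guard_sketch_iface_t;

/* Reading dcis[dci].is_shared is only valid once the endpoint has a DCI.
 * Without the first check, an endpoint with no DCI would index dcis[255],
 * which is the out-of-bounds access a sanitizer reports. */
static inline int guard_sketch_ep_dci_is_shared(const guard_sketch_iface_t *iface,
                                                uint8_t dci)
{
    if (dci == GUARD_SKETCH_EP_NO_DCI) {
        return 0;
    }

    return iface->dcis[dci].is_shared;
}
```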
src/uct/ib/mlx5/dc/dc_mlx5_ep.h
Outdated
@@ -366,6 +365,8 @@ uct_dc_mlx5_dci_pool_init_dci(uct_dc_mlx5_iface_t *iface, uint8_t pool_index,
         return status;
     }

+    dci->is_shared = uct_dc_mlx5_iface_is_policy_shared(iface) ||
Set it later: initialize fields in the order they appear in the struct.
    ucs_arbiter_t *waitq;
    uct_rc_txqp_t *txqp;
    int16_t available;

    ucs_assert(!iface->super.super.config.tx_moderation);

    if (uct_dc_mlx5_iface_is_dci_shared(iface)) {
        if (ep->dci == UCT_DC_MLX5_EP_NO_DCI) {
remove assert+comment in line 745?
pls also update or remove the comment in that line
Signed-off-by: Roie Danino <[email protected]>
9f006eb to e558c11
What
Introduces a new DCI allocation policy, dcs_hybrid, which behaves like dcs_quota except that it uses a dedicated HW DCI when one is available, instead of waiting for a DCI from the pool to become available.
Why ?
Using HW resources instead of waiting for another DCI improves latency for certain message sizes.
How ?
If dci_alloc_or_create returned 0, use the dedicated HW DCI with a new DCI channel.
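The fallback described above could look roughly like the sketch below. This is an illustration under stated assumptions, not the PR's implementation: every hybrid_sketch_* name is hypothetical, a return value of 0 is assumed to mean that no pool DCI could be obtained, the HW DCI is assumed to sit at index 0, and "a new DCI channel" is modelled as a simple round-robin counter.

```c
#include <stdint.h>

/* Assumed index of the dedicated HW DCI. */
#define HYBRID_SKETCH_HW_DCI 0

/* Simplified interface state (not the real uct_dc_mlx5_iface_t). */
typedef struct {
    uint8_t pool_next;       /* next unused pool DCI index (pool starts at 1) */
    uint8_t pool_end;        /* one past the last pool DCI index */
    uint8_t hw_next_channel; /* next channel to hand out on the HW DCI */
    uint8_t hw_num_channels; /* number of channels on the HW DCI */
} hybrid_sketch_iface_t;

/* Simplified endpoint state. */
typedef struct {
    uint8_t dci;
    uint8_t channel;
} hybrid_sketch_ep_t;

/* Stand-in for dci_alloc_or_create: returns 1 if a pool DCI was assigned,
 * 0 if the pool is exhausted (assumed meaning of "returned 0"). */
static int hybrid_sketch_dci_alloc_or_create(hybrid_sketch_iface_t *iface,
                                             hybrid_sketch_ep_t *ep)
{
    if (iface->pool_next >= iface->pool_end) {
        return 0;
    }

    ep->dci     = iface->pool_next++;
    ep->channel = 0;
    return 1;
}

/* Hybrid policy: prefer a pool DCI; if none is available, fall back to the
 * HW DCI right away and take its next channel instead of waiting. */
static void hybrid_sketch_dci_get(hybrid_sketch_iface_t *iface,
                                  hybrid_sketch_ep_t *ep)
{
    if (hybrid_sketch_dci_alloc_or_create(iface, ep)) {
        return;
    }

    ep->dci                = HYBRID_SKETCH_HW_DCI;
    ep->channel            = iface->hw_next_channel;
    iface->hw_next_channel = (uint8_t)((iface->hw_next_channel + 1) %
                                       iface->hw_num_channels);
}
```

With a pool of one DCI, the first endpoint gets the pool DCI, and later endpoints share the HW DCI on successive channels.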